An Experimental Evaluation of the REE SIFT Environment for Spaceborne Applications
Abstract
Few distributed software-implemented fault tolerance (SIFT) environments have been experimentally evaluated using substantial applications to show that they protect both themselves and the applications from errors. This paper presents an experimental evaluation of a SIFT environment used to oversee spaceborne applications as part of the Remote Exploration and Experimentation (REE) program at the Jet Propulsion Laboratory. The SIFT environment is built around a set of self-checking ARMOR processes running on different machines that provide error detection and recovery services to themselves and to the REE applications. An evaluation methodology is presented in which over 28,000 errors were injected into both the SIFT processes and two representative REE applications. The experiments were split into three groups of error injections, with each group successively stressing the SIFT error detection and recovery more than the previous group. The results show that the SIFT environment added negligible overhead to the application’s execution time during failure-free runs. Correlated failures affecting a SIFT process and application process are possible, but the division of detection and recovery responsibilities in the SIFT environment allows it to recover from these multiple failure scenarios. Only 28 cases were observed in which either the application failed to start or the SIFT environment failed to recognize that the application had completed. Further investigations showed that assertions within the SIFT processes, coupled with object-based incremental checkpointing, were effective in preventing system failures by protecting dynamic data within the SIFT processes.
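The last point of the abstract, assertions combined with object-based incremental checkpointing, can be illustrated with a minimal sketch. This is not the ARMOR implementation: the class `CheckpointedElement`, its `running_apps` field, and the update functions are hypothetical names chosen for illustration. The sketch assumes that an update is committed to the checkpoint only if an assertion over the object's state still holds, and that only the fields touched by the update are copied.

```python
import copy


class CheckpointedElement:
    """Hypothetical object whose dynamic data is guarded by an assertion."""

    def __init__(self, **state):
        self.state = dict(state)
        # The checkpoint starts as a full copy of the initial state.
        self._checkpoint = copy.deepcopy(self.state)

    def invariant(self):
        # Illustrative assertion: the number of running applications
        # can never be negative.
        return self.state["running_apps"] >= 0

    def commit(self, changed_keys):
        # Incremental checkpoint: copy only the fields this update touched.
        for key in changed_keys:
            self._checkpoint[key] = copy.deepcopy(self.state[key])

    def rollback(self):
        # Discard the current state and restore the last committed one.
        self.state = copy.deepcopy(self._checkpoint)

    def apply(self, update):
        """Run update(state), which returns the set of keys it changed.

        Returns True if the update was committed, False if the assertion
        failed and the state was rolled back.
        """
        changed = update(self.state)
        if not self.invariant():
            self.rollback()
            return False
        self.commit(changed)
        return True


def start_app(state):
    state["running_apps"] += 1
    return {"running_apps"}


def corrupt(state):
    # Simulated data error, standing in for an injected fault.
    state["running_apps"] = -7
    return {"running_apps"}


element = CheckpointedElement(running_apps=0)
committed = element.apply(start_app)  # assertion holds, update is checkpointed
rejected = element.apply(corrupt)     # assertion fails, state is restored
```

In this sketch the corrupted value never reaches the checkpoint, so recovery restores the last state that passed the assertion.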
Similar resources
Performance Evaluation of Local Detectors in the Presence of Noise for Multi-Sensor Remote Sensing Image Matching
Automatic, efficient, accurate, and stable image matching is one of the most critical issues in remote sensing, photogrammetry, and machine vision. In recent decades, various algorithms have been proposed based on the feature-based framework, which concentrates on detecting and describing local features. Understanding the characteristics of different matching algorithms in various applications ...
Measurement-Based Analysis of System Dependability Using Fault Injection and Field Failure Data
The discussion in this paper focuses on the issues involved in analyzing the availability of networked systems using fault injection and the failure data collected by the logging mechanisms built into the system. In particular we address: (1) analysis in the prototype phase using physical fault injection to an actual system. We use example of fault injection-based evaluation of a software-imple...
Detailed Radiation Fault Modeling of the Remote Exploration and Experimentation (REE) First Generation Testbed Architecture
The goal of the NASA HPCC Remote Exploration and Experimentation (REE) Project is to transfer commercial supercomputing technology into space. The project will use state-of-the-art, low-power, non-radiation-hardened, Commercial Off-The-Shelf (COTS) hardware chips and COTS software to the maximum extent possible, and will rely on Software-Implemented Fault Tolerance (SIFT) to provide the require...
Aquatic ecotoxicity of lanthanum - A review and an attempt to derive water and sediment quality criteria.
Rare earth elements (REE) used to be taken as tracers of geological origin for fluvial transport. Nowadays their increased applications in innovative environmental-friendly technology (e.g. in catalysts, superconductors, lasers, batteries) and medical applications (e.g. MRI contrast agent) lead to man-made, elevated levels in the environment. So far, no regulatory thresholds for REE concentrati...